Architecture and Performance of the Hitachi SR2201 Massively Parallel Processor System

Authors

Hiroaki Fujii

Yoshiko Yasuda

Hideya Akashi

Yasuhiro Inagami

Makoto Koga

Osamu Ishihara

Masamori Kashiyama

Hideo Wada

Tsutomu Sumimoto

Abstract

RISC-based Massively Parallel Processors (MPPs) often show low efficiency in real-world applications because of cache miss penalties, insufficient memory system throughput, and poor inter-processor communication performance. Hitachi's SR2201, an MPP scalable up to 2048 processors and 600 GFLOPS peak performance, overcomes these problems by introducing three novel features. First, its processor, the 150 MHz HARP-1E, addresses the cache miss penalty through "pseudo vector processing" (PVP). In PVP, data is prefetched into a special register bank, bypassing the cache. Second, a multi-bank memory architecture that operates like a pipeline eliminates the memory system bottleneck. Third, inter-processor communication achieves high performance on the three-dimensional crossbar network, using a "remote DMA transfer" protocol and hardware-based cache coherency. As a result of these improvements, the SR2201 achieved 220.4 GFLOPS with 1024 processors in the LINPACK benchmark, which is almost 72% of the peak performance.
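The efficiency figure quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only: the clock rate, processor counts, and LINPACK result come from the abstract, while the value of 2 floating-point operations per cycle per processor is an assumption that the abstract does not state (it is chosen because it reproduces the roughly 600 GFLOPS peak quoted for 2048 processors). The names `peak_gflops` and `efficiency` are hypothetical helpers.

```python
# Figures quoted in the abstract.
CLOCK_HZ = 150e6            # HARP-1E clock rate
LINPACK_GFLOPS = 220.4      # measured LINPACK result
LINPACK_PROCESSORS = 1024   # processors used for that result
MAX_PROCESSORS = 2048       # largest configuration

# ASSUMPTION (not in the abstract): each processor completes
# 2 floating-point operations per clock cycle at peak.
FLOPS_PER_CYCLE = 2


def peak_gflops(processors, clock_hz=CLOCK_HZ, flops_per_cycle=FLOPS_PER_CYCLE):
    """Theoretical peak in GFLOPS for a given processor count."""
    return processors * clock_hz * flops_per_cycle / 1e9


def efficiency(achieved_gflops, peak):
    """Fraction of the theoretical peak that was actually achieved."""
    return achieved_gflops / peak


peak_1024 = peak_gflops(LINPACK_PROCESSORS)
peak_2048 = peak_gflops(MAX_PROCESSORS)
eff = efficiency(LINPACK_GFLOPS, peak_1024)

print(f"peak at 1024 processors: {peak_1024:.1f} GFLOPS")
print(f"peak at 2048 processors: {peak_2048:.1f} GFLOPS")
print(f"LINPACK efficiency: {eff:.1%}")
```

Under this assumption the 1024-processor peak is about 307 GFLOPS, so 220.4 GFLOPS corresponds to an efficiency just under 72%, consistent with the abstract's "almost 72%".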

Similar resources

An efficient implementation of parallel eigenvalue computation for massively parallel processing

This article describes an efficient implementation and evaluation of a parallel eigensolver for computing all eigenvalues of dense symmetric matrices. Our eigensolver uses a Householder tridiagonalization method, which has higher parallelism and performance than conventional methods when problem size is relatively small, e.g. the order of 10,000. This is very important for relevant practical appl...

Parallel Java for the Hitachi SR 2201

In October of 1997, a year-long collaborative project was started between Hitachi Europe Limited (HEL) and the Edinburgh Parallel Computing Centre (EPCC) at the University of Edinburgh. This project had the goal of producing an environment whereby Java programs may be executed on the Hitachi SR2201 distributed memory multi-processor machine. The two key deliverables from this work are a port of...

Deadlock-Free Fault-tolerant Routing in the Multi-dimensional Crossbar Network and Its Implementation for the Hitachi SR2201

We have developed a hardware detour path selection facility for the Hitachi SR2201 parallel computer, which uses a multi-dimensional crossbar as an inter-processor network to ensure operating efficiency and high reliability when a part of the network is faulty. When this hardware facility is used, packets are transmitted to their destination along alternative paths to avoid the fault. However, ...

A Methodology for Automatically Tuned Parallel Tridiagonalization on Distributed Memory Vector-parallel Machines

In this paper, we describe an auto-tuning methodology for the parallel tridiagonalization to attain high performance. By searching the optimal set of three parameters for the performance, a highly efficient routine can be obtained automatically. Evaluation of the methodology on the distributed memory parallel machines, the HITACHI SR2201 and HITACHI SR8000, has been provided. The experimental res...

Effective Simulation for the Giga-scale Massively Parallel Supercomputer SR2201

A high performance parallel network simulation environment was developed in the SR2201 project. The SR2201 is one of the highest performance massively parallel supercomputers in the world. The enhanced simulation algorithm achieved a 2.4 times increase in simulation speed compared with conventional simulation methodology. A 98% detection rate for all design errors before physical design contrib...

Publication date: 1997
